智能论文笔记

A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors

Nikhil Ghosh , Mikhail Belkin

分类： (统计)机器学习 | 机器学习

2022-07-23

在这项工作中，我们建立了一种算法和分布分布在模型大小，过剩测试损失和线性预测因子的训练损失之间的独立非质合权权衡。具体而言，我们表明，在测试数据上表现良好的模型要么是“经典” - 训练损失接近噪声水平，要么是“现代” - 比较了更多的参数。完全适合培训数据所需的最低限度。当白色特征的限制光谱分布是Marchenko-Pastur时，我们还提供了更精确的渐近分析。值得注意的是，虽然Marchenko-Pastur分析在插值峰附近更为精确，但参数的数量足以适应训练数据，但在最实际的利益的设置中，它与分布与仅由适度的乘法常数限制的分布有所不同。

translated by 谷歌翻译

Deconstructing Distributions: A Pointwise Framework of Learning

Gal Kaplun , Nikhil Ghosh , Saurabh Garg , Boaz Barak , Preetum Nakkiran

分类：机器学习 | 人工智能 | 计算机视觉 | (统计)机器学习

2022-02-20

在机器学习中，我们传统上评估单个模型的性能，平均在测试输入集合中进行平均。在这项工作中，我们提出了一种新方法：在$ \ textit {单个输入点} $上评估时，我们测量了模型集合的性能。具体来说，我们研究了一个点的$ \ textit {profile {profile} $：模型在测试分布上的平均性能与他们在该点上的角度表现之间的关系。我们发现配置文件可以在分布和分发的模型和数据的结构中产生新的见解。例如，我们从经验上表明，实际数据分布由具有质量不同的点组成。一方面，有“兼容”点，在角度和平均性能之间具有很强的相关性。另一方面，有些点具有弱甚至$ \ textit {nogate} $相关性：提高整体模型精度实际上$ \ textit {hurts} $性能的情况。我们证明，这些实验观察与先前工作中提出的几种简化学习模型的预测不一致。作为一个应用程序，我们使用配置文件来构造一个数据集，我们称为CIFAR-10-NENG：CINIC-10的子集，因此对于标准模型，CIFAR-10-NENG上的准确性为$ \ textit {negalissiper {negalissiperational {negalishatied} CIFAR-10测试。这首先说明了一个完全逆转“准确性”的OOD数据集（Miller，Taori，Raghunathan，Sagawa，Koh，Koh，Shankar，Liang，Carmon和Schmidt 2021）

translated by 谷歌翻译

The Three Stages of Learning Dynamics in High-Dimensional Kernel Methods

Nikhil Ghosh , Song Mei , Bin Yu

分类： (统计)机器学习 | 机器学习

2021-11-13

要了解深度学习的作品，了解神经网络的培训动态至关重要。关于这些动态的几个有趣的假设是基于经验观察到的现象，但存在有限的理论上了解此类现象的时间和原因。在本文中，我们考虑了内核最小二乘目标对梯度流动的培训动态，这是SGD培训的神经网络的限制动态。使用精确的高维渐近学，我们将拟合模型的动态表征在两个“世界”中：在甲骨文世界中，该模型在人口分布和实证世界中培训，模型在采样的数据集上培训。我们展示在内核的温和条件下，$ L ^ 2 $目标回归函数，培训动力学经历三个阶段，其特征在于两个世界的模型的行为。我们的理论结果也在数学上正式化一些有趣的深度学习现象。具体而言，在我们的环境中，我们展示了SGD逐步了解更多复杂的功能，并且存在“深度引导”现象：在第二阶段，尽管经验训练误差要小得多，但两个世界的测试错误仍然接近。最后，我们提供了一个具体的例子，比较了两种不同核的动态，这表明更快的培训不需要更好地推广。

translated by 谷歌翻译

Mapping smallholder cashew plantations to inform sustainable tree crop expansion in Benin

Leikun Yin , Rahul Ghosh , Chenxi Lin , David Hale , Christoph Weigl , James Obarowski , Junxiong Zhou , Jessica Till , Xiaowei Jia , Troy Mao

分类：计算机视觉 | 机器学习

2023-01-01

Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.

translated by 谷歌翻译

Logic Mill -- A Knowledge Navigation System

Sebastian Erhardt , Mainak Ghosh , Erik Buunk , Michael E. Rose , Dietmar Harhoff

分类：自然语言处理

2022-12-31

Logic Mill is a scalable and openly accessible software system that identifies semantically similar documents within either one domain-specific corpus or multi-domain corpora. It uses advanced Natural Language Processing (NLP) techniques to generate numerical representations of documents. Currently it leverages a large pre-trained language model to generate these document representations. The system focuses on scientific publications and patent documents and contains more than 200 million documents. It is easily accessible via a simple Application Programming Interface (API) or via a web interface. Moreover, it is continuously being updated and can be extended to text corpora from other domains. We see this system as a general-purpose tool for future research applications in the social sciences and other domains.

translated by 谷歌翻译

Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents

Sayar Ghosh Roy , Anshul Padhi , Risubh Jain , Manish Gupta , Vasudeva Varma

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-31

Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity

translated by 谷歌翻译

Using Machine Learning to Determine Morphologies of $z<1$ AGN Host Galaxies in the Hyper Suprime-Cam Wide Survey

Chuan Tian , C. Megan Urry , Aritra Ghosh , Ryan Ofman , Tonima Tasnim Ananna , Connor Auge , Nico Cappelluti , Meredith C. Powell , David B. Sanders , Kevin Schawinski

分类：机器学习

2022-12-20

We present a machine-learning framework to accurately characterize morphologies of Active Galactic Nucleus (AGN) host galaxies within $z<1$. We first use PSFGAN to decouple host galaxy light from the central point source, then we invoke the Galaxy Morphology Network (GaMorNet) to estimate whether the host galaxy is disk-dominated, bulge-dominated, or indeterminate. Using optical images from five bands of the HSC Wide Survey, we build models independently in three redshift bins: low $(0<z<0.25)$, medium $(0.25<z<0.5)$, and high $(0.5<z<1.0)$. By first training on a large number of simulated galaxies, then fine-tuning using far fewer classified real galaxies, our framework predicts the actual morphology for $\sim$ $60\%-70\%$ host galaxies from test sets, with a classification precision of $\sim$ $80\%-95\%$, depending on redshift bin. Specifically, our models achieve disk precision of $96\%/82\%/79\%$ and bulge precision of $90\%/90\%/80\%$ (for the 3 redshift bins), at thresholds corresponding to indeterminate fractions of $30\%/43\%/42\%$. The classification precision of our models has a noticeable dependency on host galaxy radius and magnitude. No strong dependency is observed on contrast ratio. Comparing classifications of real AGNs, our models agree well with traditional 2D fitting with GALFIT. The PSFGAN+GaMorNet framework does not depend on the choice of fitting functions or galaxy-related input parameters, runs orders of magnitude faster than GALFIT, and is easily generalizable via transfer learning, making it an ideal tool for studying AGN host galaxy morphology in forthcoming large imaging survey.

translated by 谷歌翻译

An Investigation of Indian Native Language Phonemic Influences on L2 English Pronunciations

Shelly Jain , Priyanshi Pal , Anil Vuppala , Prasanta Ghosh , Chiranjeevi Yarra

分类：自然语言处理

2022-12-19

Speech systems are sensitive to accent variations. This is especially challenging in the Indian context, with an abundance of languages but a dearth of linguistic studies characterising pronunciation variations. The growing number of L2 English speakers in India reinforces the need to study accents and L1-L2 interactions. We investigate the accents of Indian English (IE) speakers and report in detail our observations, both specific and common to all regions. In particular, we observe the phonemic variations and phonotactics occurring in the speakers' native languages and apply this to their English pronunciations. We demonstrate the influence of 18 Indian languages on IE by comparing the native language pronunciations with IE pronunciations obtained jointly from existing literature studies and phonetically annotated speech of 80 speakers. Consequently, we are able to validate the intuitions of Indian language influences on IE pronunciations by justifying pronunciation rules from the perspective of Indian language phonology. We obtain a comprehensive description in terms of universal and region-specific characteristics of IE, which facilitates accent conversion and adaptation of existing ASR and TTS systems to different Indian accents.

translated by 谷歌翻译

LaSQuE: Improved Zero-Shot Classification from Explanations Through Quantifier Modeling and Curriculum Learning

Sayan Ghosh , Rakesh R Menon , Shashank Srivastava

分类：自然语言处理

2022-12-18

A hallmark of human intelligence is the ability to learn new concepts purely from language. Several recent approaches have explored training machine learning models via natural language supervision. However, these approaches fall short in leveraging linguistic quantifiers (such as 'always' or 'rarely') and mimicking humans in compositionally learning complex tasks. Here, we present LaSQuE, a method that can learn zero-shot classifiers from language explanations by using three new strategies - (1) modeling the semantics of linguistic quantifiers in explanations (including exploiting ordinal strength relationships, such as 'always' > 'likely'), (2) aggregating information from multiple explanations using an attention-based mechanism, and (3) model training via curriculum learning. With these strategies, LaSQuE outperforms prior work, showing an absolute gain of up to 7% in generalizing to unseen real-world classification tasks.

translated by 谷歌翻译

Cartographer_glass: 2D Graph SLAM Framework using LiDAR for Glass Environments

Lasitha Weerakoon , Gurtajbir Singh Herr , Jasmine Blunt , Miao Yu , Nikhil Chopra

分类：机器人

2022-12-16

We study algorithms for detecting and including glass objects in an optimization-based Simultaneous Localization and Mapping (SLAM) algorithm in this work. When LiDAR data is the primary exteroceptive sensory input, glass objects are not correctly registered. This occurs as the incident light primarily passes through the glass objects or reflects away from the source, resulting in inaccurate range measurements for glass surfaces. Consequently, the localization and mapping performance is impacted, thereby rendering navigation in such environments unreliable. Optimization-based SLAM solutions, which are also referred to as Graph SLAM, are widely regarded as state of the art. In this paper, we utilize a simple and computationally inexpensive glass detection scheme for detecting glass objects and present the methodology to incorporate the identified objects into the occupancy grid maintained by such an algorithm (Google Cartographer). We develop both local (submap level) and global algorithms for achieving the objective mentioned above and compare the maps produced by our method with those produced by an existing algorithm that utilizes particle filter based SLAM.

translated by 谷歌翻译